Asking whether parameters can be identified from one or several datasets is an important step towards sound model development. Most mechanistic predator-prey modelling has attempted either parameterization from process rate data or inverse modelling (identifying parameters solely from time series). Here, we take a middle road: we aim at identifying the potential benefits of combining datasets, when both population growth and predation processes are viewed as stochastic. We fit a discrete-time, stochastic predator-prey model of the Leslie type to time series and functional response data simulated from the model. Our model has both environmental stochasticity in the growth rates and interaction stochasticity, i.e., a stochastic functional response. We examine what the functional response data brings to the quality of the estimates, and whether estimation would be possible (for various time series lengths) solely with time series data. Both Bayesian and frequentist estimation are performed, and in both cases we report identifiability diagnostics for the various parameters. Our results suggest that even if the functional response is only slightly stochastic, identifying parameters in fact requires functional response data as a complement to time series of counts [more stuff on results here]. Our framework to combine data sets is general and may be extended to other interaction types for which data on both interaction rates and population counts are available.
Many attempts to fit predator-prey models to data assume observation stochasticity and do inverse modelling (e.g., Jost & Arditi 2001 or, more recently, DeLong et al. 2018; Rosenbaum et al. 2019). Inverse modelling from time series can be hard due to identifiability issues (Eisenberg & Hayashi 2014), and is therefore fraught with considerable uncertainty. Even in simple, phenomenological statistical models for two species, there can remain considerable uncertainty about the model structure and parameters (Barraquand & Nielsen 2018).
One way to decrease that uncertainty on the parameters is to combine several data sets.
This may be best done in a fully stochastic context: environmental stochasticity is widespread (and may dominate the dynamics). REFs stochastic predator-prey models.
Stochasticity also affects the functional response for various reasons. Point clouds of observed kill rates are often anything but a clean type II curve (e.g., the data points plotted in Fig. 5 of Rosenbaum & Rall 2018). Considering the FR as a stochastic object is therefore both biologically realistic and statistically convenient. In fact, some researchers have chosen to move away from the classical food web modelling paradigm because of its reliance on deterministic functional relationships (Planque et al. 2014). To our knowledge, fully embracing the stochasticity of the functional response in an inferential context has only been done by Gilioli et al. (2008), Gilioli et al. (2012). However, the continuous-time formulation that they used resulted in considerable model complexity.
Here we consider a ground truth model with stochasticity in both growth rates and interaction parameters, in discrete time. The model is fitted to both time series and functional response data. We use both frequentist and Bayesian estimation paradigms, and check that our results are congruent. Identifiability is examined in both cases.
We consider first long time series, close to perfect information (T=1000), and then time series of realistic ecological length (T=100, T=50 or T=25) [the latter two remain to be done]. Yet to do: we vary the percentage of data points attributed to the time series vs the functional response.
We chose a model with a numerical response of the Leslie type (though similar analyses can be done for Rosenzweig-MacArthur models, see Supplement S1). A Beverton-Holt type density-dependence was chosen for the prey to avoid cycles in the prey in the absence of the predator, so that the model behaviour is more reminiscent of its continuous-time counterpart (see Weide et al. for more on connecting continuous- to discrete-time models). Our model reads
\[ \begin{aligned} N_{t+1} & = N_{t}\frac{e^{r+\epsilon_{1t}}}{1+\gamma N_{t}}\exp\left(-G(N_{t})\frac{P_{t}}{N_{t}}\right),\,\epsilon_{1t}\sim\mathcal{N}(0,\sigma_{1}^{2})\label{eq:prey_discreteLeslieMay} \end{aligned} \]
\[ \begin{aligned} P_{t+1} & = P_{t}\frac{e^{s+\epsilon_{2t}}}{1+qP_{t}/N_{t}},\,\epsilon_{2t}\sim\mathcal{N}(0,\sigma_{2}^{2})\label{eq:predator_discreteLeslieMay} \end{aligned} \]
The roots of this discrete-time formulation can be traced back to Leslie (1948), Leslie & Gower (1960), who included a Beverton-Holt regulation for prey and predator to make discrete-time models more similar to their continuous-time counterparts. We additionally consider a saturating functional response \(G(N_t)\), which makes our model resemble the continuous-time models of Tanner (1975) or May (1973), and later Turchin & Hanski (1997), whose notations we have kept. One of the advantages of this model over the Rosenzweig-MacArthur is that it can be parameterized using the observed intrinsic growth rates, as Tanner (1975) did. Parameter values were inspired by small mammals (Turchin & Hanski 1997), which we modified slightly [these are not the same models anyway]. The division by \(N_t\) in \(\exp(-G(N_t)P_t/N_t)\) expresses the fact that all terms within the exponential are on the prey fitness scale (per capita mortality). This is similar to assuming a Nicholson-Bailey type predation term (e.g., Weide et al. 2018).
Until now, we have not specified a model for the functional response \(G(N_{t})\). With a deterministic functional response (FR), we have a classic stochastic predator-prey model with log-normal environmental noise but an otherwise `deterministic skeleton'. A stability analysis of the deterministic model is performed in Appendix A1, and later used to determine which parameters lead to quasi-cycles or noisy limit cycles in the stochastic model.
However, here we consider a data-generating process where the functional response is not deterministic but itself stochastic, as in the following equation for a Holling type II functional response:
\[ \begin{equation}\label{eq:FR} G_{t}|N_t\sim\mathcal{N}\left(\frac{CN_{t}}{D+N_{t}},\sigma_{3}^2\right) \end{equation} \]
This corresponds to a case where there are mild Gaussian fluctuations around the mean functional response. Because there can be substantial noise on the FR (see e.g. the plots in Fig. 5 of Rosenbaum & Rall 2018), we also consider more complicated cases where the parameters \(C\) and \(D\) are themselves allowed to vary in time in Supplement S2 [I was thinking to only do this in a bayesian setting for the estimation part, since these models need more constraining].
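To build intuition for the model, its deterministic skeleton (noise terms set to zero, and \(G_t\) replaced by its Holling type II mean \(CN_t/(D+N_t)\)) can be iterated directly. A minimal R sketch, using the quasi-cycle parameter set given below (note that \(\gamma\) in the prey equation corresponds to \(1/K\) in the simulation code):

```r
## Deterministic skeleton: noise set to zero, functional response at its mean.
## Parameter values follow the quasi-cycle set used in the main text.
r <- 2; s <- 0.5; K <- 1      # growth rates and prey density-dependence (gamma = 1/K)
q <- 10; C <- 2.5; D <- 1     # predator density-dependence and FR parameters
N <- P <- numeric(1000)
N[1] <- 1; P[1] <- 0.1
for (t in 1:999) {
  G <- C * N[t] / (D + N[t])                                 # mean functional response
  N[t+1] <- N[t] * exp(r) / (1 + N[t] / K) * exp(-G * P[t] / N[t])
  P[t+1] <- P[t] * exp(s) / (1 + q * P[t] / N[t])
}
tail(cbind(N, P), 3)  # oscillations damp out towards the coexistence fixed point
```

With this parameter set the deterministic trajectories spiral into the stable fixed point; the sustained quasi-cycles of the main text appear only once noise is switched back on.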
Although we apply data integration here to a predator-prey case, the methodology is more general and can in principle be applied to any addition of auxiliary information (interaction rates, demographic rates) that is available over time and not simply derived from the count data. The approach has similarities to Integrated Population Modelling (Besbeas et al. 2002; Maunder 2004; Péron & Koons 2012) and to data fusion approaches in ecosystem science (Niu et al. 2014). In a predator-prey context, we can cite the recent endeavour of Ferguson et al. (2018) to combine count and isotopic (diet) data, but their model differs in that it does not view the interaction rate as the result of a stochastic process.
We illustrate the method with the predator-prey case where log-densities for both predator and prey are gathered in a log-count vector \(\mathbf{x}_{t}=(\ln(N_{t}),\ln(P_{t}))^{T}\). Auxiliary information on the functional response, or rather on the observed kill rate per predator, is denoted \(G_t\). To this can be added other demographic (vital) rates \(R_{t}\), stacked in a vector as well, \(\mathbf{a}_{t}=(G_{t}, R_{t})\). Currently we use \(\mathbf{a}_{t}=G_{t}\), but it may be useful to add more information in other applications; hence the derivation is kept general.
We consider a discrete-time dynamical system (nonlinear difference equation). The population dynamics part of the model gives us the probability law of \(\mathbf{x}_{t+1}|(\mathbf{x}_{t},\mathbf{a}_{t})\), since the counts at time \(t+1\) depend both on past abundances and the interaction or demographic data based on the chosen mechanistic model. We also know the probability law of \(\mathbf{a}_{t}|\mathbf{x}_{t}\) (in our simple predator-prey case, the functional response). We can therefore write down easily the joint likelihood for both data sources in quite general terms, denoting \(\mathbf{X}=(\mathbf{x}_{1},...,\mathbf{x}_{t_{m}})\) and \(\mathbf{A}=(\mathbf{a}_{1},...,\mathbf{a}_{t_{m}})\):
\[ \begin{equation} \mathcal{L}(\mathbf{X},\mathbf{A})=p(\mathbf{x}_{1},\mathbf{a}_{1})\prod_{t=1}^{t_{m}-1}p_{C}(\mathbf{x}_{t+1},\mathbf{a}_{t+1}|\mathbf{x}_{t},\mathbf{a}_{t}) \end{equation} \]
where \(p\) is the joint probability density of the initial vector \((\mathbf{x}_{1},\mathbf{a}_{1})\) and \(p_{C}\) the conditional density of \((\mathbf{x}_{t+1},\mathbf{a}_{t+1})\) given \((\mathbf{x}_{t},\mathbf{a}_{t})\). The conditional pdf can be further decomposed using the chain rule
\[ p(\mathbf{x}_{t+1},\mathbf{a}_{t+1}|\mathbf{x}_{t},\mathbf{a}_{t})=p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1},\mathbf{x}_{t},\mathbf{a}_{t})\times p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t})=p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1})\times p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t}) \]
where \(p_{1}(y)\) is given by the dynamical system and \(p_{2}(y)\) is the functional response model (or a demographic model). We therefore end up with a model
\[ \begin{equation}\label{full-likelihood} \mathcal{L}(\mathbf{X},\mathbf{A})=p_{2}(\mathbf{a}_{1}|\mathbf{x}_{1})p(\mathbf{x}_{1})\prod_{t=1}^{t_{m}-1}\underbrace{p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t})}_{\text{dynamical system}}\times\underbrace{p_{2}(\mathbf{a}_{t+1}|\mathbf{x}_{t+1})}_{\text{auxiliary information model}} \end{equation} \]
where we swapped \(p_{1}\) and \(p_{2}\) to get the dynamical system model first.
In the simplest case highlighted by the discrete-time dynamical systems of the sections above, \(p_{1}(x|y)=p_{11}(x_{1}|y)p_{12}(x_{2}|y)\) is the product of the two Gaussian pdfs for log-densities conditional on past densities and auxiliary information. Using equations \eqref{eq:prey_discreteLeslieMay} and \eqref{eq:predator_discreteLeslieMay}, we obtain the following difference equations on the log-scale:
\[ \begin{equation} n_{t+1}|\mathbf{x}_{t}=\ln(N_{t+1})|\mathbf{x}_{t}\sim\mathcal{N}(\mu_{1t},\sigma_{1}^{2})\,\,,\mu_{1t}=n_{t}+r-G_{t}\frac{P_{t}}{N_{t}}-\ln(1+\gamma N_{t}) \end{equation} \] and \[ \begin{equation} p_{t+1}|\mathbf{x}_{t}=\ln(P_{t+1})|\mathbf{x}_{t}\sim\mathcal{N}(\mu_{2t},\sigma_{2}^{2})\,\,,\mu_{2t}=p_{t}+s-\ln(1+qP_{t}/N_{t}) \end{equation} \] with a functional response model (\(p_{2}\)) also given by a Gaussian pdf (this is the simplest case where we assume near Gaussian noise on the FR, see Discussion)
\[ G_{t}|N_t \sim\mathcal{N}(\mu_{3t},\sigma_{3}^{2}),\;\mu_{3t}=\frac{CN_{t}}{D+N_{t}} \]
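For a single time step, the three Gaussian transition densities above translate directly into a log-likelihood contribution \(\log p_{1}(\mathbf{x}_{t+1}|\mathbf{a}_{t},\mathbf{x}_{t}) + \log p_{2}(G_{t+1}|N_{t+1})\). A minimal sketch (the function name and the parameter values in the check are illustrative, not from the manuscript):

```r
## Log-likelihood contribution of one transition (x_t, G_t) -> (x_{t+1}, G_{t+1}),
## using the Gaussian densities for n_{t+1}, p_{t+1} and G_{t+1} given above.
step_loglik <- function(n_t, p_t, n_t1, p_t1, G_t, G_t1,
                        r, gamma, sigma1, s, q, sigma2, C, D, sigma3) {
  N_t <- exp(n_t); P_t <- exp(p_t); N_t1 <- exp(n_t1)
  mu1 <- n_t + r - G_t * P_t / N_t - log(1 + gamma * N_t)  # prey mean (mu_1t)
  mu2 <- p_t + s - log(1 + q * P_t / N_t)                  # predator mean (mu_2t)
  mu3 <- C * N_t1 / (D + N_t1)                             # FR mean at t+1 (mu_3t)
  dnorm(n_t1, mu1, sigma1, log = TRUE) +
    dnorm(p_t1, mu2, sigma2, log = TRUE) +
    dnorm(G_t1, mu3, sigma3, log = TRUE)
}
```

Summing these contributions over \(t\) (plus the initial term) gives the joint log-likelihood of eq. \eqref{full-likelihood}.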
We considered the following parameter sets for the model [illustration needed, 4 panels: (N(t), P(t), F(N), N vs P) for each parameter set, illustrate the first one in the MS, second one in Appendix?]
| Parameter | Quasi-cycles | Noisy LC |
|---|---|---|
| C | 2.5 | 15.0 |
| D | 1.0 | 0.25 |
| Q | 10 | 10 |
The rest of the parameters are \(r=2\), \(s=0.5\), \(K=1\) (with \(\gamma=1/K\)), and \(\sigma_1^2=\sigma_2^2=\sigma_3^2=0.05\).
The first parameter set corresponds to a forced focus or quasi-cycles, i.e., sustained oscillations that arise from the interaction between noise and damped oscillations towards the fixed point of the deterministic model. We also consider a noisy limit cycle, i.e., parameters that give rise to a limit cycle even without the noise, so that the cycle is very regular but perturbed by the noise.
These two sets of parameters are crossed with several modalities of data availability:

- time series length T = 1000 or T = 100 [should I add T=50 and 25?]
- functional response data available over 100%, 25%, or 0% of the time series.

This is meant to emulate common scenarios in which kill rates are not monitored over the whole time series, and to quantify the benefits of adding even a little FR data.
Note that in the case without FR data, we fit a model without noise in the functional response [this is something we need to discuss in depth Olivier], which is what is usually done in this setting (Turchin & Ellner (2000); newer stuff?).
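A convenient way to implement partial FR coverage is to pass the unobserved kill rates as NA: in JAGS, missing entries in observed nodes are treated as latent quantities and imputed during sampling. A standalone sketch of the masking step (the `FR` series here is a placeholder; in the manuscript it comes from the simulation below):

```r
## Keep FR observations for only a fraction of the time series.
## NA entries passed to JAGS become unobserved stochastic nodes.
set.seed(1)
n.years <- 1000
FR <- rnorm(n.years, 2, 0.05)                 # placeholder kill-rate series
frac <- 0.25                                  # fraction of FR data retained
keep <- sort(sample(n.years, round(frac * n.years)))
FR_obs <- rep(NA_real_, n.years)
FR_obs[keep] <- FR[keep]
# jags.data <- list(T = n.years, logN = log(N), logP = log(P), FR = FR_obs)
```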
For each parameter × data availability scenario, we fit the models in both Bayesian and frequentist settings.
We fitted the model by MCMC using JAGS, and also derived its likelihood mathematically, which we then optimised with the BFGS algorithm [see comments below on Nelder-Mead as well] via optim() in R. Several starting values were considered.
Identifiability was inspected in different ways. In a frequentist setting, we computed the Hessian matrix of the negative log-likelihood \(H(\theta,Y)\), where \(Y = (X,A)\) following eq. \eqref{full-likelihood}, at the estimated parameter value \(\hat{\theta}\). Formally, \(H(\hat{\theta},Y)\) is the observed Fisher Information Matrix (FIM). When the Hessian \(H(\hat{\theta},Y)\) has only non-zero eigenvalues (and is therefore invertible), the model is locally identifiable, in the sense that no two parameter sets can give the same log-likelihood value in a neighborhood of the optimum (Rothenberg 1971). We also compute the expected FIM at the true parameter value. This quantity is relevant theoretically because asymptotically, as the time series length \(T\) gets large, \(\hat{\theta} \sim \mathcal{N}(\theta_{\text{true}},\mathcal{I}_T(\theta_{\text{true}})^{-1})\), where \(\mathcal{I}_T(\theta_{\text{true}})\) is the FIM for a time series of length \(T\), obtained by taking expectations over many time series of that length. As the time series length \(T \rightarrow \infty\), we most likely have convergence of \(H(\hat{\theta},y^T) \rightarrow \mathcal{I}_T(\theta_{\text{true}})\) [should I try to prove this properly btw? I think this should be doable by splitting the long time series into smaller chunks, and then proving that the combined log-likelihood is obtained by summing LL for all chunks. Perhaps we can use - among other things - that for a partition \((T_1,..., T_n)\) of the time series of total length \(T\) into chunks of size \(T/n\), we have \(\mathcal{I}_T(\theta_{\text{true}}) = n \mathcal{I}_{T/n}(\theta_{\text{true}})\)].
We also computed the expected pairwise correlations between parameters: the variance-covariance matrix of the estimates is \(\Sigma = H^{-1}\), from which we easily obtain the expected pairwise parameter correlation matrix \((\rho_{ij})\), with each element defined as \(\rho_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\Sigma_{jj}}}\).
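These frequentist diagnostics follow directly from the optimisation output. A sketch, illustrated on a simple Gaussian toy likelihood so that it runs standalone; in the manuscript, `negLogLik` would be the joint negative log-likelihood of eq. \eqref{full-likelihood}:

```r
## Identifiability diagnostics from the observed Fisher information:
## eigenvalues of the Hessian at the optimum, and parameter correlations
## from its inverse. Toy model: Gaussian sample with unknown mean and log-sd.
set.seed(1)
y <- rnorm(200, mean = 2, sd = 0.5)
negLogLik <- function(theta) -sum(dnorm(y, theta[1], exp(theta[2]), log = TRUE))
fit <- optim(c(0, 0), negLogLik, method = "BFGS", hessian = TRUE)
H <- fit$hessian              # observed FIM at the estimate
eigen(H)$values               # all strictly positive => locally identifiable
Sigma <- solve(H)             # asymptotic variance-covariance matrix
cov2cor(Sigma)                # expected pairwise parameter correlations
```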
In a Bayesian setting, we inspected the correlations between the Markov chains of pairs of parameters, which translate into pairwise joint posterior distributions. Parameters whose chains were strongly positively or negatively correlated were considered not separately identifiable.
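The Bayesian analogue can be computed directly from the posterior draws. A sketch on placeholder draws so that it runs standalone; with an R2jags fit, `sims` would be `out$BUGSoutput$sims.matrix[, c("r_V","K_V","C","D")]`:

```r
## Posterior pairwise correlations from MCMC draws; strongly correlated
## pairs flag parameters that are not separately identifiable.
set.seed(1)
sims <- cbind(r_V = rnorm(6000), K_V = rnorm(6000))
sims <- cbind(sims, C = sims[, "r_V"] + rnorm(6000, 0, 0.1))  # placeholder: C tied to r_V
round(cor(sims), 2)   # the (r_V, C) entry is close to 1 by construction
```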
[so far this is very preliminary and I do not consider all the abovementioned modalities of the analysis. We focus for now on T=1000 and the comparison with/without FR data, in both Bayesian and frequentist settings]
Here, we describe how to simulate the stochastic model [may be ignored in the paper but useful at that stage].
rm(list=ls())
### Parameters for simulation model
n.years<-1000 # Number of years - 25 first, perhaps use 50 or 100 / worked very well with almost no process error on the real scale
N1<-1 # Initial pop size
P1<-0.1
K<-1 # threshold dd
beta<-1 # density-dependence exponent
rmax_V<-2 # Max AVERAGE growth rate (thus not a true max...)
rmax_P<-0.5
sigma2.proc<-0.05 # worked well with 0.005 as well
# Process sigma on the log-scale, use the Peretti et al. value. 0.005
### FR and predator parameters
C<-2.5
D<-1
Q<-10
### Simulation of data
#set.seed(42)
set.seed(41)
y<-N<-P<-FR<-numeric(n.years)
N[1]<-N1
P[1]<-P1
FR[1]<-C*N[1]/(D+N[1])
rV<-rnorm(n.years-1,rmax_V,sqrt(sigma2.proc))
rP<-rnorm(n.years-1,rmax_P,sqrt(sigma2.proc))
FRnoise<-rnorm(n.years,0,sqrt(sigma2.proc))
for (t in 1:(n.years-1)){
N[t+1]<-N[t]*(exp(rV[t])/(1+(N[t]/K)^beta))*exp(-FR[t]*P[t]/N[t])
P[t+1]<-P[t]*exp(rP[t])/(1+P[t]*Q/N[t])
FR[t+1]<-(C*N[t+1]/(D+N[t+1])) + FRnoise[t+1] #after for update
}
## Plotting time series of abundances and FR
par(mfrow=c(2,2))
plot(1:n.years,N,type="b")
plot(1:n.years,P,type="b")
plot(N,FR)
plot(log(N),log(P))
Below we reproduce the JAGS code that was used to fit the model to both predator-prey count data and FR data.
library("R2jags") # Load R2jags package
# Bundle data
jags.data <- list(T=n.years,logN=log(N),logP=log(P),FR=FR)
sink("predprey.txt")
cat("
model {
# Priors and constraints
logN[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
logP[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
# Priors prey population dynamics
r_V ~ dnorm(1,0.001) # below the truth, rather flat prior
K_V ~ dunif(0.2,10)
sigma_V ~ dunif(0.01,5) # rather vague
sigma2_V<-pow(sigma_V, 2)
tau_V<-pow(sigma_V,-2)
#Priors predator population dynamics
Q ~ dgamma(0.1,0.1)
r_P ~ dnorm(1,0.1)
sigma_P ~ dunif(0.01,2) # rather vague
sigma2_P<-pow(sigma_P, 2)
tau_P<-pow(sigma_P,-2)
#Priors predation parameters
tau_FR ~ dgamma(.01,.01)
C~dgamma(.01,.01) # uninformative priors OK for that one
D~dgamma(.01,.01)
# check model Leslie to see how she specified priors...
# Likelihood
# state process
for (t in 1:(T-1)){
FRUpdate[t] <- C*N[t]/(D+N[t]) # mean functional response; the noise enters through the dnorm below
FR[t] ~ dnorm(FRUpdate[t],tau_FR) #small trick to use FR data
logNupdate[t] <- logN[t] + r_V -log(1+N[t]/K_V) -FR[t]*exp(logP[t])/N[t]
logN[t+1] ~ dnorm(logNupdate[t],tau_V)
N[t]<-exp(logN[t])
# for some reason, log(1+(exp(r_V)-1)*N[t]/K_V) was not working
logP[t+1]~ dnorm(logPupdate[t],tau_P)
logPupdate[t] <- logP[t] + r_P - log(1+exp(logP[t])*Q/exp(logN[t]) )
}
}
",fill=TRUE)
sink()
# Initial values
inits <- function () {
list(sigma_V=runif(1,0.1,2), sigma_P=runif(1,0.1,2), r_V=runif(1,0.1,2),r_P=runif(1,0.1,2), K_V=runif(1,0.2,8), Q=runif(1,0,5),tau_FR=runif(1,1,10),C=runif(1,10,100),D=runif(1,0.01,0.1))}
# Parameters monitored
#parameters<-c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","b","C","D","logNupdate","logPupdate","FR")
parameters<-c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","tau_FR","C","D")
# MCMC settings
nc <- 3 #number of chains
nb <- 14000 # “burn in”
#ni <- 14000# “number of iterations” # that's for a symmetric distrib...
ni<-34000
nt <- 10 # “thinning”
We do not evaluate this chunk here because it takes too long to run; we load the stored results later on.
# run model
out <- jags(jags.data, inits, parameters, "predprey.txt", n.chains=nc, n.thin=nt, n.iter=ni, n.burnin=nb, working.directory = getwd())
print(out, dig = 2)
The model without the functional response is the following:
### Now try to fit a model without the FR data.
sink("predprey_without_sepFR.txt")
cat("
model {
# Priors and constraints
logN[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
logP[1] ~ dnorm(0,0.01) # Prior on initial pop size on the log scale
# Priors prey population dynamics
r_V ~ dnorm(1,0.001) # below the truth, rather flat prior
K_V ~ dunif(0.2,10)
sigma_V ~ dunif(0.01,5) # rather vague
sigma2_V<-pow(sigma_V, 2)
tau_V<-pow(sigma_V,-2)
#Priors predator population dynamics
Q ~ dgamma(0.1,0.1)
r_P ~ dnorm(1,0.1)
sigma_P ~ dunif(0.01,2) # rather vague
sigma2_P<-pow(sigma_P, 2)
tau_P<-pow(sigma_P,-2)
#Priors predation parameters
C~dgamma(.01,.01) #
D~dgamma(.01,.01)
# check model Leslie to see how she specified priors...
# Likelihood
# state process
for (t in 1:(T-1)){
logNupdate[t] <- logN[t] + r_V -log(1+N[t]/K_V) -C*exp(logP[t])/(D+N[t])
logN[t+1] ~ dnorm(logNupdate[t],tau_V)
N[t]<-exp(logN[t])
# for some reason, log(1+(exp(r_V)-1)*N[t]/K_V) was not working
logP[t+1]~ dnorm(logPupdate[t],tau_P)
logPupdate[t] <- logP[t] + r_P - log(1+exp(logP[t])*Q/exp(logN[t]) )
}
}
",fill=TRUE)
We assume in this second model that the FR is deterministic rather than stochastic. This is something that we might discuss.
Both models take fairly long to run for T=1000 (approx. 1h), so we stored the results instead (not on GitHub, a bit heavy).
Now we analyse those results, starting with what the functional response looks like. In orange, a least-squares fit to the FR data; in blue, the fitted FR for the Bayesian model with stochastic functional response; in red, the fitted FR for the Bayesian model with deterministic functional response (hence inferred indirectly from the time series). This reveals a serious mismatch for the red curve, whilst the blue one is very good.
CEb<-out$BUGSoutput$mean$C
DEb<-out$BUGSoutput$mean$D
CEr<-out2$BUGSoutput$mean$C
DEr<-out2$BUGSoutput$mean$D
### Fit functional response
par(mfrow=c(1,1))
fr_fit<-nls(FR~CE*N/(DE+N),start=list(CE=1,DE=1))
CE<-coef(fr_fit)[1]
DE<-coef(fr_fit)[2]
plot(N,FR,ylim=c(0,max(FR,na.rm=T)))
lines(N,CE*N/(DE+N),col="orange") # Frequentist fit of just the FR data in orange
lines(N,CEb*N/(DEb+N),col="blue") # Bayesian fit TS data + FR data
lines(N,CEr*N/(DEr+N),col="red") # Bayesian fit only TS data
### plot densities
library(mcmcplots)
### here is a density plot for the stochastic FR model
denplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
We now plot the posterior densities for the deterministic FR model
denplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
Traceplots for the stochastic FR model, showing good convergence
### Trace plots
traplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
Traceplots for the deterministic FR model, showing wild excursions in parameter space
traplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
Let’s compare diagnostics
For the stochastic FR model
print(out,dig=2)
## Inference for Bugs model at "predprey.txt", fit using jags,
## 3 chains, each with 34000 iterations (first 14000 discarded), n.thin = 10
## n.sims = 6000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## C 2.51 0.04 2.43 2.48 2.51 2.54 2.60 1.00
## D 1.01 0.10 0.82 0.94 1.01 1.08 1.22 1.00
## K_V 0.98 0.19 0.63 0.84 0.97 1.11 1.37 1.07
## Q 10.27 0.80 8.77 9.72 10.24 10.80 11.92 1.00
## r_P 0.50 0.03 0.44 0.48 0.50 0.53 0.57 1.00
## r_V 2.04 0.17 1.74 1.91 2.02 2.16 2.40 1.06
## sigma2_P 0.05 0.00 0.05 0.05 0.05 0.06 0.06 1.00
## sigma2_V 0.05 0.00 0.05 0.05 0.05 0.05 0.05 1.00
## tau_FR 20.71 0.93 18.92 20.07 20.69 21.34 22.57 1.00
## deviance -429.74 4.16 -435.87 -432.80 -430.41 -427.34 -419.93 1.00
## n.eff
## C 6000
## D 6000
## K_V 43
## Q 810
## r_P 880
## r_V 45
## sigma2_P 3500
## sigma2_V 6000
## tau_FR 6000
## deviance 610
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 8.6 and DIC = -421.1
## DIC is an estimate of expected predictive error (lower deviance is better).
For the deterministic FR model
print(out2,dig=2)
## Inference for Bugs model at "predprey_without_sepFR.txt", fit using jags,
## 3 chains, each with 34000 iterations (first 14000 discarded), n.thin = 10
## n.sims = 6000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## C 8.91 14.52 1.44 1.91 2.19 6.41 53.38 2.70
## D 15.83 33.25 0.00 0.00 0.00 10.62 119.90 2.86
## K_V 1.16 0.51 0.55 0.77 0.97 1.50 2.38 2.86
## Q 10.27 0.80 8.79 9.71 10.23 10.78 11.94 1.00
## r_P 0.50 0.03 0.45 0.48 0.50 0.53 0.57 1.00
## r_V 1.96 0.34 1.34 1.67 2.01 2.22 2.55 2.82
## sigma2_P 0.05 0.00 0.05 0.05 0.05 0.06 0.06 1.00
## sigma2_V 0.05 0.00 0.05 0.05 0.05 0.05 0.05 1.00
## deviance -231.95 4.06 -239.16 -234.67 -232.32 -229.63 -222.79 1.16
## n.eff
## C 4
## D 4
## K_V 4
## Q 3200
## r_P 2700
## r_V 4
## sigma2_P 6000
## sigma2_V 1600
## deviance 18
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 7.3 and DIC = -224.7
## DIC is an estimate of expected predictive error (lower deviance is better).
[I think in this context, if we fit the deterministic FR model to a stochastic FR simulation, that it makes sense to stick to a small noise on the FR, so that people see how a seemingly small noise can change the results]
Let us now consider potential correlations in the posteriors
### Plot pair posterior densities
postsamples=cbind(out$BUGSoutput$sims.list$r_V,
out$BUGSoutput$sims.list$K_V,
out$BUGSoutput$sims.list$r_P,
out$BUGSoutput$sims.list$Q,
out$BUGSoutput$sims.list$C,
out$BUGSoutput$sims.list$D)
#png(file="PairPosteriorPlot_withFRdata.png", width = 1200, height = 1200,res=300)
pairs(postsamples,c("r_V","K_V","r_P","Q","C","D"))
#dev.off()
postsamples2=cbind(out2$BUGSoutput$sims.list$r_V,
out2$BUGSoutput$sims.list$K_V,
out2$BUGSoutput$sims.list$r_P,
out2$BUGSoutput$sims.list$Q,
out2$BUGSoutput$sims.list$C,
out2$BUGSoutput$sims.list$D)
#png(file="PairPosteriorPlot_withoutFRdata.png", width = 1200, height = 1200, res=300)
pairs(postsamples2,c("r_V","K_V","r_P","Q","C","D"))
#dev.off()
#pdf(file="PairCorrelPosteriorPlot.pdf",width = 4,height = 8)
par(mfrow=c(2,1))
parcorplot(out,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
parcorplot(out2,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","C","D"))
#dev.off()
This is one way of visualizing the correlations between parameters. Another idea (that we also used in the previous paper) is to visualize the consequences of those correlations for the estimation of the functional forms.
We start by reproducing the estimated FR for the stochastic FR model, with and without the correlations between parameters C and D. Removing the correlations is done by permuting the values of the Markov chains for the two parameters independently.
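The code for the stochastic FR model mirrors the deterministic-FR block below; a compact sketch as a helper function, using base R's `adjustcolor` in place of `scales::alpha` so that it is self-contained (the function name and usage lines are illustrative):

```r
## Draw one FR curve per posterior sample of (C, D), plus the true FR in red.
plot_fr_draws <- function(Clist, Dlist, Ctrue, Dtrue, main) {
  Nprey <- seq(1, 20, length.out = 100)   # prey density index
  plot(NULL, xlim = c(1, 10), ylim = c(1, 3), xlab = 'N prey',
       ylab = 'Functional response', main = main)
  for (i in seq_along(Clist))
    lines(Nprey, Clist[i] * Nprey / (Dlist[i] + Nprey),
          lwd = 0.5, col = adjustcolor('black', alpha.f = 0.05))
  lines(Nprey, Ctrue * Nprey / (Dtrue + Nprey), col = 'red')  # true FR
}
# With the R2jags fit 'out' of the stochastic FR model:
# Clist <- out$BUGSoutput$sims.list$C; Dlist <- out$BUGSoutput$sims.list$D
# par(mfrow = c(1, 2))
# plot_fr_draws(Clist, Dlist, C, D, 'With (C,D) correlations')
# plot_fr_draws(sample(Clist), sample(Dlist), C, D, 'Without (C,D) correlations')
```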
We now do the same plot for the deterministic FR model (relying only on count time series data).
Clist = out2$BUGSoutput$sims.list$C
Dlist = out2$BUGSoutput$sims.list$D
n = length(Clist)
ndens = 100
Nprey <- seq(1,20,length=ndens) #density index
FRstoch = matrix(NA,nrow = n, ncol = ndens)
#png('Estimated_FR_withoutFRdata.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
FRstoch[i,dens] = Clist[i]*Nprey[dens]/(Dlist[i]+Nprey[dens])
}
if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='With (C,D) correlations')
}
else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,C*Nprey/(D+Nprey),col=alpha('red',1.0))
}
# what if there was no correlation between parameters?
Clist = sample(Clist)
Dlist = sample(Dlist)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
FRstoch[i,dens] = Clist[i]*Nprey[dens]/(Dlist[i]+Nprey[dens])
}
if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='Without (C,D) correlations')
}
else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,C*Nprey/(D+Nprey),col=alpha('red',1.0))
}
#dev.off()
Let us now reproduce the density-dependence curves with and without the correlations between parameters. We do this for the prey; results are similar for the predator.
# For the prey
rlist = out$BUGSoutput$sims.list$r_V
Klist = out$BUGSoutput$sims.list$K_V
n = length(rlist)
ndens = 100
Nprey <- seq(1,50,length=ndens) #density index
preyDD = matrix(NA,nrow = n, ncol = ndens)
#png('Estimated_preyDD.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
}
if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='With (r,K) correlations')
}
else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}
# what if there was no correlation between parameters?
rlist = sample(rlist)
Klist = sample(Klist)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
}
if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='Without (r,K) correlations')
}
else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}
#dev.off()
And now for the deterministic FR model
## without FR data
# For the prey
rlist = out2$BUGSoutput$sims.list$r_V
Klist = out2$BUGSoutput$sims.list$K_V
n = length(rlist)
ndens = 100
Nprey <- seq(1,50,length=ndens) #density index
preyDD = matrix(NA,nrow = n, ncol = ndens)
#png('Estimated_preyDD_withoutFRdata.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
}
if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='With (r,K) correlations')
}
else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}
# what if there was no correlation between parameters?
rlist = sample(rlist)
Klist = sample(Klist)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
preyDD[i,dens] = exp(rlist[i])/(1+Nprey[dens]/Klist[i])
}
if (i == 1){plot(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Prey growth rate',ylim=c(-1,6),xlim=c(1,50),xlab='N prey',main='Without (r,K) correlations')
}
else {lines(Nprey,preyDD[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,exp(rmax_V)/(1+Nprey/K),col=alpha('red',1.0))
}
#dev.off()
We see that in all cases, the deterministic FR model performs much worse (in spite of the small noise). Moreover, the correlations between parameters, which could seem worrying (and signal non-identifiability), actually allow more precise functional forms to be produced for both the functional response and the growth rate curve. Therefore, so long as one is aware of these correlations when looking at the point estimates, they may not be a problem.
Here, we fit the same two models (predator-prey model with stochastic FR vs. deterministic FR) to data simulated with the stochastic interaction model, this time by maximum likelihood. The setup is therefore identical to that of the Bayesian analysis above.
We first define the two negative log-likelihoods (one with FR data, one without) and the two sums of squares that will be used in what follows.
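The likelihood functions below use `logprot`, a protected logarithm defined earlier in the document. For readers jumping in here, a minimal sketch of such a function (an assumption about its behaviour, not necessarily the exact definition used elsewhere) could be:

```r
# Hypothetical sketch of logprot: a log() protected against non-positive
# arguments, so that the likelihood never returns -Inf or NaN.
logprot <- function(x, eps = .Machine$double.xmin) {
  log(pmax(x, eps))
}
```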
################# Define the likelihood ########################################
logLik=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data
#### Parameters
# theta1 = r
# theta2 = gamma
# theta3 = sigma1
# theta4 = s
# theta5 = q
# theta6 = sigma2
# theta7 = C
# theta8 = D
# theta9 = sigma3
n=nrow(y)
ll = 0.0 ### Or p_1(a_1|x_1) p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2])
N_tplus1 = exp(y[t,1]) #useful for the functional response
######## Error that was previously there!! N_t/P_t instead P_t/N_t ###
### mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t
######################################################################
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - y[t-1,3]*P_t/N_t #this is correct timing
mu2 = y[t-1,2] + theta[4] - logprot((1+theta[5]*P_t/N_t))
mu3 = (theta[7]*N_tplus1)/(theta[8] + N_tplus1) #easier to update all variables simultaneously, minimizes errors
#ll= ll + log(dnorm(y[t,1], mu1, theta[3])) + log(dnorm(y[t,2], mu2, theta[6])) + log(dnorm(y[t,3], mean = mu3, sd = theta[9]))
# we have log(0) problem
d1=dnorm(y[t,1], mu1, theta[3],log=T) ## directly asking for the log avoids problems
d2=dnorm(y[t,2], mu2, theta[6],log=T)
d3=dnorm(y[t,3], mu3, theta[9],log=T)
ll=ll+d1+d2+d3
}
return(-ll)
}
############### Working directly with the sum of squares ######################
RSS=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data
#### Parameters
# theta1 = r
# theta2 = gamma
# theta3 = s
# theta4 = q
# theta5 = C
# theta6 = D
# not the same theta
n=nrow(y)
rss = 0.0 ### Or p_1(a_1|x_1) p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2])
N_tplus1 = exp(y[t,1]) #useful for the functional response
############## Correction of error ###########################################
## mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t # error
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - y[t-1,3]*P_t/N_t # corrected
mu2 = y[t-1,2] + theta[3] - logprot((1+theta[4]*P_t/N_t))
mu3 = (theta[5]*N_tplus1)/(theta[6] + N_tplus1)
rss=rss+(y[t,1] - mu1)^2+(y[t,2]-mu2)^2+(y[t,3]-mu3)^2
}
return(rss)
}
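Working directly with the sum of squares is justified by concentrating (profiling) the Gaussian likelihood over the variance parameters. For each series \(k\in\{1,2,3\}\), with residual sum of squares \(\mathrm{RSS}_k(\theta)\) over the \(n-1\) transitions, maximizing the log-likelihood over \(\sigma_k\) gives

\[ \hat\sigma_k^2(\theta) = \frac{\mathrm{RSS}_k(\theta)}{n-1}, \qquad \ell_{\mathrm{prof}}(\theta) = \mathrm{const} - \frac{n-1}{2}\sum_{k=1}^{3}\log \mathrm{RSS}_k(\theta). \]

Strictly speaking, ML with series-specific variances therefore minimizes the sum of log-RSS terms, while minimizing the single summed RSS used here corresponds to assuming a common variance across the three series, which holds in our simulations (all variances equal 0.05).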
############################### LL ##################################################
################# Define the likelihood ########################################
logLik_FRwoutNoise=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data
#### Parameters
# theta1 = r
# theta2 = gamma
# theta3 = sigma1
# theta4 = s
# theta5 = q
# theta6 = sigma2
# theta7 = C
# theta8 = D
# theta9 = sigma3
n=nrow(y)
ll = 0.0 ### Or p_1(a_1|x_1) p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2])
######## Error that was previously there!! N_t/P_t instead P_t/N_t ###
### mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t
######################################################################
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - (theta[7]*P_t)/(theta[8] + N_t)
mu2 = y[t-1,2] + theta[4] - logprot((1+theta[5]*P_t/N_t))
#ll= ll + log(dnorm(y[t,1], mu1, theta[3])) + log(dnorm(y[t,2], mu2, theta[6])) + log(dnorm(y[t,3], mean = mu3, sd = theta[9]))
# we have log(0) problem
d1=dnorm(y[t,1], mu1, theta[3],log=T) ## directly asking for the log avoids problems
d2=dnorm(y[t,2], mu2, theta[6],log=T)
#d3=dnorm(y[t,3], mu3, theta[9],log=T)
ll=ll+d1+d2#+d3
}
return(-ll)
}
############### Working directly with the sum of squares ######################
RSS_FRwoutNoise=function(theta,y){
#### y is the data with n rows and 3 columns // log-abundance data + FR data
#### Parameters
# theta1 = r
# theta2 = gamma
# theta3 = s
# theta4 = q
# theta5 = C
# theta6 = D
# not the same theta
n=nrow(y)
rss = 0.0 ### Or p_1(a_1|x_1) p(x_1)
for (t in 2:n){
N_t = exp(y[t-1,1])
P_t = exp(y[t-1,2])
############## Correction of error ###########################################
## mu1 = y[t-1,1] + theta[1] - log(1+theta[2]*N_t) - y[t-1,3]*N_t/P_t # error
mu1 = y[t-1,1] + theta[1] - logprot(1+theta[2]*N_t) - (theta[5]*P_t)/(theta[6] + N_t) # corrected
mu2 = y[t-1,2] + theta[3] - logprot((1+theta[4]*P_t/N_t))
rss=rss+(y[t,1] - mu1)^2+(y[t,2]-mu2)^2#+(y[t,3]-mu3)^2
}
return(rss)
}
We then compute the maximum likelihood estimate, starting the optimization reasonably far away from it.
### We define the data
### Produce dataset to fit
data = cbind(log(N),log(P),FR)
######### Estimation starting at the MLE ###########
theta_true = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D,sqrt(0.05))
#theta_init = theta_true + rnorm(9,0,sd=0.05) # very small errors
### Crafting a new theta-init with reasonable a priori errors
theta_init = c(3,0.5,0.75,1,5,0.75,4,5,0.75)
p_opt<-optim(theta_init, logLik, y=data,method="BFGS",hessian=T)
p_opt$par
## [1] 2.0298182 1.0349005 0.2225043 0.5067403 10.3093055 0.2314224
## [7] 2.5146359 1.0202974 0.2194918
theta_true
## [1] 2.0000000 1.0000000 0.2236068 0.5000000 10.0000000 0.2236068
## [7] 2.5000000 1.0000000 0.2236068
hessian=p_opt$hessian
eigen(hessian)$values ## clearly no zeroes there?
## [1] 41477.939825 40362.281312 37310.999154 33665.713773 18679.901299
## [6] 16862.113820 75.108327 12.352505 1.498253
lambda1 = max(eigen(hessian)$values)
9*lambda1*10^(-9) #way below lowest eigenvalue
## [1] 0.0003733015
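The comparison above can be wrapped in a small helper (`check_identifiability` is our own hypothetical name, not used elsewhere): the criterion flags practical non-identifiability when the smallest Hessian eigenvalue falls below \(p\,\lambda_1 \times 10^{-9}\), with \(p\) the number of parameters and \(\lambda_1\) the largest eigenvalue.

```r
# Hypothetical helper for the eigenvalue criterion used above: flags the fit
# as practically non-identifiable when the smallest eigenvalue of the Hessian
# of -LL falls below p * lambda_1 * tol.
check_identifiability <- function(hessian, tol = 1e-9) {
  ev <- eigen(hessian, symmetric = TRUE, only.values = TRUE)$values
  threshold <- nrow(hessian) * max(ev) * tol
  list(eigenvalues = ev, threshold = threshold,
       identifiable = min(ev) > threshold)
}
```

For instance, `check_identifiability(hessian)` reproduces the comparison above.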
Looking at the correlations in the matrix
Sigma =solve(hessian) #variance-covariance matrix
Rho = cov2cor(Sigma) #correlation matrix
library(corrplot)
par(mfrow=c(1,1))
rownames(Rho) = c("r_V","1/K_V","sigma1","r_P","Q","sigma2","C","D","sigma3")
colnames(Rho) = c("r_V","1/K_V","sigma1","r_P","Q","sigma2","C","D","sigma3")
corrplot(Rho, method="circle")
corrplot(Rho, method = "number")
# There are significant correlations between pairs of parameters (same as in the Bayesian analysis)
We recover the same correlations as in the Bayesian analysis. Let us see whether we can drop the sigma parameters, which may make the optimization more difficult while adding little to the estimation of the other parameters:
######## Estimation based on RSS ##################
### New restricted theta
theta_true = c(rmax_V,1/K,rmax_P,Q,C,D)
#theta_init = theta_true + rnorm(6,0,sd=0.05)
theta_init = c(3,0.5,1,5,4,5)
p_opt<-optim(theta_init, RSS, y=data,hessian=T)
p_opt$par
## [1] 2.9229583 2.8012547 0.4128283 7.9391372 2.5043341 0.9945316
theta_true
## [1] 2.0 1.0 0.5 10.0 2.5 1.0
theta_init
## [1] 3.0 0.5 1.0 5.0 4.0 5.0
eigen(p_opt$hessian)$values ### note the near-zero (and slightly negative) smallest eigenvalues
## [1] 2221.3231218 2001.4856462 1638.0325684 7.4255103 0.2789953
## [6] -0.1160630
From this we can get (approximate) confidence intervals [which we will plot in a similar way to the Bayesian posterior estimates]. [Do we use an approximate Gaussian or something else? Some people use \(\chi^2\)-based intervals on the deviance scale.]
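A minimal sketch of Wald-type intervals from the observed information (assuming the optimum is interior and the Hessian of the negative log-likelihood is positive definite; `wald_ci` is our own helper name):

```r
# Wald-type confidence intervals from the Hessian of the negative
# log-likelihood at the MLE (observed Fisher information).
wald_ci <- function(par, hessian, level = 0.95) {
  z  <- qnorm(1 - (1 - level) / 2)
  se <- sqrt(diag(solve(hessian)))
  cbind(estimate = par, lower = par - z * se, upper = par + z * se)
}
# e.g. wald_ci(p_opt$par, p_opt$hessian)
```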
###################################################################################################
################### Now for the model where the FR is specified without the noise #################
###################################################################################################
# theta_true for this model has only 8 elements: no sigma3, since the FR is deterministic
theta_true = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D)
### Crafting a new theta-init with reasonable a priori errors
theta_init = c(3,0.5,0.75,1,5,0.75,4,5)
p_opt1<-optim(theta_init, logLik_FRwoutNoise,method="BFGS", y=data,hessian=T)
p_opt1$par
## [1] 1.5718984 0.5730893 0.2226600 0.5067040 10.3083510 0.2314212
## [7] 26.0655759 53.0488133
theta_true
## [1] 2.0000000 1.0000000 0.2236068 0.5000000 10.0000000 0.2236068
## [7] 2.5000000 1.0000000
theta_init
## [1] 3.00 0.50 0.75 1.00 5.00 0.75 4.00 5.00
eigen(p_opt1$hessian)$values ### BFGS
## [1] 5.505953e+04 4.030562e+04 3.731199e+04 1.868010e+04 4.707488e+01
## [6] 1.498589e+00 8.201987e-02 1.141953e-04
lambda1 = max(eigen(p_opt1$hessian)$values)
8*lambda1*10^(-9) ### Viallefont et al.'s criterion: this threshold exceeds the smallest eigenvalue above
## [1] 0.0004404762
diag(solve(p_opt1$hessian)) ## positive values though
## [1] 2.021508e-02 1.120156e-02 2.481616e-05 1.005341e-03 6.663426e-01
## [6] 2.680104e-05 1.476976e+03 7.292133e+03
### Looks like BFGS performs worse than Nelder-Mead without the FR data (see below)
p_opt2<-optim(theta_init, logLik_FRwoutNoise, y=data,hessian=T)
p_opt2$par
## [1] 2.1948736 1.2279028 0.2206843 0.3550715 6.5116240 0.2309290 4.5669852
## [8] 5.1584874
eigen(p_opt2$hessian)$values ### Nelder-Mead by default
## [1] 42784.648389 39426.476918 30604.638451 18749.117486 46.054877
## [6] 3.689130 2.639528 -0.111515
## Beware that we use -LL, so Hessian = observed FIM (Fisher Information Matrix)
## Should we loop over datasets to obtain the true, expected FIM?
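In answer to the question above, the expected FIM could be approximated by averaging the observed information over many simulated datasets. A sketch, where `simulate_data` is a hypothetical stand-in for the simulation code used earlier and `logLik` is the negative log-likelihood defined above:

```r
# Sketch: average the Hessian of -LL, evaluated at the true parameters, over
# datasets resimulated from the model, to approximate the expected FIM.
# simulate_data() is hypothetical here.
expected_fim <- function(theta, nsim = 100, ...) {
  H <- 0
  for (b in 1:nsim) {
    y <- simulate_data(theta, ...)            # hypothetical simulator
    H <- H + optimHess(theta, logLik, y = y)  # Hessian of -LL at theta
  }
  H / nsim
}
```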
We plot likelihood slices for two parameters at a time, based on the pairs of correlated parameters identified in the preceding section.
theta_true = c(rmax_V,1/K,sqrt(0.05),rmax_P,Q,sqrt(0.05),C,D,sqrt(0.05))
par(mfrow=c(1,1))
# basic informal check for C and D
logLik(theta_true,data)
##
## -223.5108
### Let's make that zoom precise
DeltaC = 0.25 # C is +/- 2
DeltaD = 0.5 #D is +/-0.9
niter = 50
C_new=D_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
for (i in 1:niter){
for (j in 1:niter){
C_new[i] =2*DeltaC*i/niter-DeltaC
D_new[j] = 2*DeltaD*j/niter-DeltaD
theta_new=theta_true+c(0,0,0,0,0,0,C_new[i],D_new[j] ,0)
llbis[i,j]=logLik(theta_new,data)
}
}
#hist(llbis) ## what values are in there
min(llbis)
## [1] -223.7919
custom_levels=c(500,100,0,-50,-200,-210,-215,-220,-225)
#contour(theta_true[7]+C_new,theta_true[8]+D_new,llbis,nlevels=20,xlab="C",ylab="D")
contour(theta_true[7]+C_new,theta_true[8]+D_new,llbis,levels=custom_levels,xlab="C",ylab="D")
### Try to visualize this using a 3D surface plot
library(rgl)
interval = (niter/2-10):(niter/2+10)
custom_levels=c(1000,500,100,50,0,-50,-100,-150,-200,-210,-215,-220)
persp3d(theta_true[7]+C_new[interval],theta_true[8]+D_new[interval],-llbis[interval,interval], col="skyblue")
### Maximum not visible on the surface.
The maximum is not extremely well defined, but it does exist.
################# Now with (r,gamma=1/K)
logLik(theta_true,data)
##
## -223.5108
niter = 50
r_new=g_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
Deltar = 1.5 # rmax_V is +/- 1.5
Deltag = 0.9 #1/K is +/- 0.9
for (i in 1:niter){
for (j in 1:niter){
r_new[i] =2*Deltar*i/niter-Deltar
g_new[j] = 2*Deltag*j/niter-Deltag
theta_new=theta_true+c(r_new[i],g_new[j],0,0,0,0,0,0,0) # 9 elements, matching theta_true
llbis[i,j]=logLik(theta_new,data)
}
}
#contour(theta_true[1]+r_new,theta_true[2]+g_new,llbis,nlevels=50,xlab="r",ylab="1/K")
#changing the representation of the levels
hist(llbis) ## what values are in there
custom_levels=c(80000,60000,40000,20000,5000,1000,500,100,50,0,-10,-20,-30)
contour(theta_true[1]+r_new,theta_true[2]+g_new,llbis,levels=custom_levels,xlab="r",ylab="1/K")
Same here: there is a maximum, even though it lies along a ridge.
########### Other profiles #####################################
# We have (r_V, 1/K) and (C,D) - we also need (r_P,Q)
################# Now with (r,gamma=1/K)
niter = 50
rP_new=q_new=rep(0,niter)
llbis=matrix(0,nrow=niter,ncol=niter)
Deltar = 0.3 # rmax_P is +/- 0.3
Deltaq = 2 # Q is +/- 2
for (i in 1:niter){
for (j in 1:niter){
rP_new[i] =2*Deltar*i/niter-Deltar
q_new[j] = 2*Deltaq*j/niter-Deltaq
theta_new=theta_true+c(0,0,0,rP_new[i],q_new[j],0,0,0,0)
llbis[i,j]=logLik(theta_new,data)
}
}
contour(theta_true[4]+rP_new,theta_true[5]+q_new,llbis,nlevels=50,xlab="rP",ylab="Q")
#changing the representation of the levels
hist(llbis) ## what values are in there
min(llbis)
## [1] -223.8116
custom_levels=quantile(llbis,probs=c(0.025,0.05,0.075,0.1,0.25,0.5,0.75),na.rm=T)
contour(theta_true[4]+rP_new,theta_true[5]+q_new,llbis,levels=custom_levels,xlab="rP",ylab="Q")
We can find similar results by working directly with the sum of squares.
(I have previously called these 2D plots likelihood profiles, but I may have been wrong about this, since “profiles” refer to ridgelines; cf. Bolker, EMDB chap. 6, p. 186. A likelihood profile is computed by fixing one parameter, say r, then optimizing -LL over all other parameters, and finally plotting the optimized -LL against r. If it is flat, then this parameter is not identifiable. There is some debate in identifiability papers (cf. Raue et al. 2009; Kao & Eisenberg 2018) about how useful these profiles are.)
As Bolker does (p. 190, Fig. 6.8), we can plot both the likelihood “slice” and the likelihood profile on the same plot. [Incidentally, he does that for the attack rate of a functional response!]
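As a sketch, a one-dimensional profile can be computed by fixing one parameter on a grid and re-optimizing over the rest. `profile_nll` is our own generic helper, usable with the `logLik` function and `data` defined above or with any objective of the form `nll(theta, ...)`:

```r
# Generic 1-D likelihood profile (cf. Bolker, EMDB ch. 6): fix parameter
# `index` at each value of `grid` and re-optimize the objective over the
# remaining parameters. A flat profile indicates practical non-identifiability.
profile_nll <- function(nll, theta_start, index, grid, ...) {
  sapply(grid, function(val) {
    obj <- function(th_free) {
      nll(append(th_free, val, after = index - 1), ...)
    }
    optim(theta_start[-index], obj)$value
  })
}
# e.g. profile_nll(logLik, theta_init, index = 7, grid = seq(2, 3, by = 0.1), y = data)
```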
The model has so far been parameterized as is often done in the literature (e.g., Weide et al. 2018), but not exactly in terms of a carrying capacity for the prey (the equilibrium population size of the Beverton-Holt model, which is not equal to K in the previous formulation). Here we attempt a reparameterization to decrease the correlation between pairs of parameters belonging to the same function in the model. In this new formulation, we have \(N^*=K\) for the prey when considered alone, and \(P^* = N/q\) for the predator alone when the prey density is fixed. We also consider the classic \((a,h)\) parameterization of the functional response.
\[ \begin{aligned} N_{t+1} & = N_{t}\frac{e^{r+\epsilon_{1t}}}{1+(e^r-1) N_{t}/K}\exp\left(-G(N_{t})\frac{P_{t}}{N_{t}}\right),\,\epsilon_{1t}\sim\mathcal{N}(0,\sigma_{1}^{2})\label{eq:prey_discreteLeslieMay_reparam} \end{aligned} \]
and
\[ \begin{aligned} P_{t+1} & = P_{t}\frac{e^{s+\epsilon_{2t}}}{1+(e^s-1)qP_{t}/N_{t}},\,\epsilon_{2t}\sim\mathcal{N}(0,\sigma_{2}^{2})\label{eq:predator_discreteLeslieMay_reparam} \end{aligned} \]
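As a quick check of the claim that \(N^*=K\) under this parameterization: setting \(\epsilon_{1t}=0\) and removing predation (\(G=0\)), the equilibrium \(N_{t+1}=N_{t}=N^{*}\) satisfies

\[ 1 = \frac{e^{r}}{1+(e^{r}-1)N^{*}/K} \iff (e^{r}-1)\frac{N^{*}}{K} = e^{r}-1 \iff N^{*}=K. \]

The same computation for the predator at fixed prey density \(N\) gives \(P^{*}=N/q\).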
Let us first produce a few diagnostics, starting with posterior densities
For the stochastic FR model
denplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
For the deterministic FR model
denplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
Now Trace plots
#png(file="TracePlot_withFRdata_reparam.png", width = 1200, height = 1200,res=300)
traplot(out,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
#dev.off()
#png(file="TracePlot_withoutFRdata_reparam.png", width = 1200, height = 1200,res=300)
traplot(out2,c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
#dev.off()
We now examine correlations for the stochastic FR model
### Plot pair posterior densities
postsamples=cbind(out$BUGSoutput$sims.list$r_V,
out$BUGSoutput$sims.list$K_V,
out$BUGSoutput$sims.list$r_P,
out$BUGSoutput$sims.list$Q,
out$BUGSoutput$sims.list$a,
out$BUGSoutput$sims.list$h)
#png(file="PairPosteriorPlot_withFRdata_reparam.png", width = 1200, height = 1200,res=300)
pairs(postsamples,c("r_V","K_V","r_P","Q","a","h"))
#dev.off()
and for the deterministic FR model
postsamples2=cbind(out2$BUGSoutput$sims.list$r_V,
out2$BUGSoutput$sims.list$K_V,
out2$BUGSoutput$sims.list$r_P,
out2$BUGSoutput$sims.list$Q,
out2$BUGSoutput$sims.list$a,
out2$BUGSoutput$sims.list$h)
#png(file="PairPosteriorPlot_withoutFRdata_reparam.png", width = 1200, height = 1200, res=300)
pairs(postsamples2,c("r_V","K_V","r_P","Q","a","h"))
#dev.off()
Let us consider other ways of plotting
#pdf(file="PairCorrelPosteriorPlot_reparam.pdf",width = 4,height = 8)
par(mfrow=c(2,1))
parcorplot(out,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
parcorplot(out2,parms = c("r_V","K_V","r_P","Q","sigma2_V","sigma2_P","a","h"))
#dev.off()
We now reproduce the FR curve with and without the correlations between parameters. The red line is again the true functional response.
For the stochastic FR model
and for the model without the FR data
alist = out2$BUGSoutput$sims.list$a
hlist = out2$BUGSoutput$sims.list$h
n = length(alist) # note: alist, not Clist (reparameterized model)
ndens = 100
Nprey <- seq(1,20,length=ndens) #density index
FRstoch = matrix(NA,nrow = n, ncol = ndens)
#png('Estimated_FR_withoutFRdata_reparam.png',res=300,width=2000,height=1000)
par(mfrow=c(1,2))
library(scales)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
FRstoch[i,dens] = alist[i]*Nprey[dens]/(1.0+alist[i]*hlist[i]*Nprey[dens])
}
if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='With (a,h) correlations')
}
else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,a*Nprey/(1+a*h*Nprey),col=alpha('red',1.0))
}
# what if there was no correlation between parameters?
alist = sample(alist)
hlist = sample(hlist)
for (i in 1:n){
for (dens in 1:length(Nprey))
{
FRstoch[i,dens] = alist[i]*Nprey[dens]/(1.0+alist[i]*hlist[i]*Nprey[dens])
}
if (i == 1){plot(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05),ylab='Functional response',ylim=c(1,3),xlim=c(1,10),xlab='N prey',main='Without (a,h) correlations')
}
else {lines(Nprey,FRstoch[i,],type='l',lwd=0.5,col=alpha('black',0.05))}
lines(Nprey,a*Nprey/(1+a*h*Nprey),col=alpha('red',1.0))
}
#dev.off()
Now we do the same for the density-dependent curves, with and without the correlations between parameters
and without FR data
We see that the reparameterization may have helped make these curves a little more precise, though not markedly so. The parameter values, and hence the model behaviour, are rigorously identical to those of the first parameterization.
It seems that the model is globally identifiable when FR data are present. However, for the first parameterization, pairs of parameters belonging to the same functional forms of the model are highly correlated. These pairs are, respectively, (r, K), (s, Q), and (C, D). Plots of the functions realized for each parameter pair show that such correlations are substantial for T=1000.
Reparameterizing so that K and Q are more closely related to carrying capacities substantially decreased parameter correlations [I would not go as far as saying it improved identifiability, since the models seemed identifiable in the first place]. Whilst Bolker (EMDB book, p. ) suggested that a Michaelis-Menten formulation (\(CN/(D+N)\)) of the FR may improve identifiability compared to the classical \((a,h)\) formulation (which was our guess as well), we did not find this here: correlation seems slightly lower for the \((a,h)\) pair.
[Another issue that we may want to convey here: Bolker mentions in his book (p. 200) that “all other things being equal, smaller confidence regions (i.e., for larger and less noisy datasets and for higher alpha levels) are more elliptical”. This means that the less noise there is and the longer the time series, the more correlation we get between parameters. Could that be true here? Check T=100 and less.]
The absence of FR data substantially compromises identifiability. This is all the more important given that we consider here fairly small noise on the functional response and relatively rich datasets. [more here based on what we find -- do profile likelihoods for these?]
(Perhaps add discussion of the work by Cole et al. (2010); exhaustive summaries are difficult to obtain here, so we cannot do symbolic work. Also discuss the subset profiling of Eisenberg & Hayashi (2014).)
[BFGS vs Nelder-Mead]
[Try likelihood surfaces and profiles for the case without FR data??]
[try model with noise on the FR even without FR data? I think I tried this once and it did not work very well, but on the other hand we had some bugs in the code]
Large noise cases, added (temporal) variability in parameters \(C\) or \(D\)
Barraquand, F. & Nielsen, Ó.K. (2018). Predator-prey feedback in a gyrfalcon-ptarmigan system? Ecology and Evolution, 8, 12425–12434.
Besbeas, P., Freeman, S.N., Morgan, B.J. & Catchpole, E.A. (2002). Integrating mark–recapture–recovery and census data to estimate animal abundance and demographic parameters. Biometrics, 58, 540–547.
Bolker, B. (2008). Ecological Models and Data in R. Princeton University Press.
Cole, D.J., Morgan, B.J. & Titterington, D. (2010). Determining the parametric structure of models. Mathematical Biosciences, 228, 16–30.
DeLong, J.P., Hanley, T.C., Gibert, J.P., Puth, L.M. & Post, D.M. (2018). Life history traits and functional processes generate multiple pathways to ecological stability. Ecology, 99, 5–12.
Eisenberg, M.C. & Hayashi, M.A. (2014). Determining identifiable parameter combinations using subset profiling. Mathematical Biosciences, 256, 116–126.
Ferguson, J.M., Hopkins III, J.B. & Witteveen, B.H. (2018). Integrating abundance and diet data to improve inferences of food web dynamics. Methods in Ecology and Evolution, 9, 1581–1591.
Fussmann, G. & Blasius, B. (2005). Community response to enrichment is highly sensitive to model structure. Biology Letters, 1, 9–12.
Gilioli, G., Pasquali, S. & Ruggeri, F. (2008). Bayesian inference for functional response in a stochastic predator–prey system. Bulletin of Mathematical Biology, 70, 358–381.
Gilioli, G., Pasquali, S. & Ruggeri, F. (2012). Nonlinear functional response parameter estimation in a stochastic predator-prey model. Math. Biosci. Eng, 9, 75–96.
Jost, C. & Arditi, R. (2001). From pattern to process: Identifying predator–prey models from time-series data. Population Ecology, 43, 229–243.
Kao, Y.-H. & Eisenberg, M.C. (2018). Practical unidentifiability of a simple vector-borne disease model: Implications for parameter estimation and intervention assessment. Epidemics, 25, 89–100.
Leslie, P. (1948). Some further notes on the use of matrices in population mathematics. Biometrika, 35, 213–245.
Leslie, P. & Gower, J. (1960). The properties of a stochastic model for the predator-prey type of interaction between two species. Biometrika, 47, 219–234.
Maunder, M.N. (2004). Population viability analysis based on combining bayesian, integrated, and hierarchical analyses. Acta Oecologica, 26, 85–94.
May, R. (1973). Stability and complexity in model ecosystems. Princeton University Press, Princeton, USA.
Niu, S., Luo, Y., Dietze, M.C., Keenan, T.F., Shi, Z. & Li, J. et al. (2014). The role of data assimilation in predictive ecology. Ecosphere, 5, art65.
Péron, G. & Koons, D.N. (2012). Integrated modeling of communities: Parasitism, competition, and demographic synchrony in sympatric ducks. Ecology, 93, 2456–2464.
Planque, B., Lindstrøm, U. & Subbey, S. (2014). Non-deterministic modelling of food-web dynamics. PloS one, 9, e108243.
Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M. & Klingmüller, U. et al. (2009). Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 25, 1923–1929.
Rosenbaum, B. & Rall, B.C. (2018). Fitting functional responses: Direct parameter estimation by simulating differential equations. Methods in Ecology and Evolution, 9, 2076–2090.
Rosenbaum, B., Raatz, M., Weithoff, G., Fussmann, G.F. & Gaedke, U. (2019). Estimating parameters from multiple time series of population dynamics using bayesian inference. Frontiers in Ecology and Evolution, 6, 234.
Tanner, J. (1975). The stability and the intrinsic growth rates of prey and predator populations. Ecology, 56, 855–867.
Turchin, P. & Ellner, S. (2000). Living on the edge of chaos: population dynamics of Fennoscandian voles. Ecology, 81, 3099–3116.
Turchin, P. & Hanski, I. (1997). An empirically based model for latitudinal gradient in vole population dynamics. American Naturalist, 149, 842–874.
Weide, V., Varriale, M.C. & Hilker, F.M. (2018). Hydra effect and paradox of enrichment in discrete-time predator-prey models. Mathematical Biosciences.